Implement the emitter
In this part of the assignment, we will implement an x86-64 emitter for our language.
This part is largely open-ended, with some guidance below. The class materials can also be helpful!
Overview
The emitter will take as input the AST and produce as output a string that contains equivalent x86-64 instructions, in AT&T syntax.
A program in our language can do one of two things:
- Print a single string literal. In this case, the program should print the string and call the exit system call with an argument of 0.
- Evaluate a mathematical expression. In this case, the program should call the exit system call with an argument whose value is the result of evaluating the expression.
Implementation guidance
Printing
For printing, use the system call for writing to standard out, along with a .data section that defines the string literal.
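As a rough sketch of the printing case, the helper below returns instructions that write the string and then exit with 0. It assumes the emitter is written in Python (the assignment does not prescribe a language) and that we target Linux, where the write system call is number 1 and exit is number 60. The name emit_print is hypothetical; msg and len are the symbols defined in the .data section described under Assembler directives.

```python
def emit_print() -> list[str]:
    """Return AT&T-syntax instructions that print msg and exit with status 0.

    Assumptions: Linux x86-64 syscall numbers (write = 1, exit = 60) and a
    .data section that defines the symbols msg and len.
    """
    return [
        "mov $1, %rax",    # syscall number for write (assumed Linux)
        "mov $1, %rdi",    # file descriptor 1 is standard out
        "mov $msg, %rsi",  # address of the string, used as an immediate
        "mov $len, %rdx",  # number of bytes to write
        "syscall",
        "mov $60, %rax",   # syscall number for exit (assumed Linux)
        "mov $0, %rdi",    # exit status 0
        "syscall",
    ]
```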
Expression evaluation
For evaluating the expression, generate x86 instructions that evaluate each subexpression. The following algorithm can turn an AST into a list of instructions that evaluate the same expression as the AST.
If the AST is a single integer literal, emit an instruction that assigns that value to register \(r_1\).
If the AST is an arithmetic operation:
- Recursively emit the left subtree of the operation. The result should then be in \(r_1\).
- Emit an instruction that pushes \(r_1\) onto the stack.
- Recursively emit the right subtree of the operation. The result should then be in \(r_1\).
- Emit an instruction that pushes \(r_1\) onto the stack.
- Emit an instruction that pops from the stack into \(r_2\).
- Emit an instruction that pops from the stack into \(r_1\).
- Emit an instruction that performs the operation, leaving the result in \(r_1\).
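The steps above can be sketched as a short recursive function. This is only one possible shape, under several assumptions that are not part of the assignment: the emitter is written in Python, an AST is either an int or a tuple (op, left, right), \(r_1\) is %rax, \(r_2\) is %rbx, and the only operators are +, - and *. The names emit_expr and emit_expr_program are hypothetical, and the exit system call number 60 assumes Linux.

```python
# Assumed mapping from source operators to x86-64 mnemonics.
OPS = {"+": "add", "-": "sub", "*": "imul"}


def emit_expr(ast) -> list[str]:
    """Emit instructions that leave the value of ast in %rax (our r1)."""
    if isinstance(ast, int):
        # Integer literal: assign the value to r1.
        return [f"mov ${ast}, %rax"]
    op, left, right = ast
    return (
        emit_expr(left)         # left result is now in r1
        + ["push %rax"]         # save it on the stack
        + emit_expr(right)      # right result is now in r1
        + [
            "push %rax",        # save it on the stack
            "pop %rbx",         # right operand into r2
            "pop %rax",         # left operand into r1
            f"{OPS[op]} %rbx, %rax",  # r1 = r1 <op> r2
        ]
    )


def emit_expr_program(ast) -> list[str]:
    """Evaluate ast, then exit with the result as the exit argument."""
    return emit_expr(ast) + [
        "mov %rax, %rdi",  # exit argument = result of the expression
        "mov $60, %rax",   # syscall number for exit (assumed Linux)
        "syscall",
    ]
```

For example, emit_expr(("-", 7, 2)) returns the instructions mov $7, %rax; push %rax; mov $2, %rax; push %rax; pop %rbx; pop %rax; sub %rbx, %rax. In AT&T syntax the destination is the second operand, so the final instruction computes left minus right.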
Assembler directives
To produce a complete x86 program that can run on a machine, we will need a few other pieces of assembly. For these pieces, we will use the syntax of a particular assembler, the GNU assembler (as).
The pieces consist of directives (commands to the assembler) and symbols (named values). A label is a special kind of symbol that provides a name for a location in the program. Note that x86 instructions can use symbols as immediate values.
.global
The .global directive makes a symbol visible to the linker.
_start:
This label should appear before the first instruction in our program. By default, the linker expects the entry point of our program to be labelled _start.
.data
The .data directive tells the assembler that the next section of the file will define data that the program uses. If our program prints a string literal, the contents of the string literal will go in this section.
.ascii <string-literal>
The .ascii directive describes a (non-nul-terminated) string literal.
len = . - <label>
This statement assigns the length of the string literal to a symbol named len, by computing the offset between the current address (.) and the address of a specific <label> (e.g., _start or msg).
Putting it all together, our full program will be:
.global _start
.text
_start:
...<instructions from the emitter>...
.data
msg:
.ascii <string-literal>
len = . - msg
where <string-literal> is the string literal from our source program, including the quotation marks.
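One way to assemble this template, again assuming a Python emitter, is a small function that wraps the emitted instructions in the directives above. The name assemble and its parameters are hypothetical. Omitting the .data section when there is no string literal is a design choice of this sketch, not something the assignment specifies.

```python
def assemble(instructions, string_literal=None) -> str:
    """Wrap emitted instructions in the directives from the template.

    string_literal, if given, must already include its quotation marks.
    """
    lines = [".global _start", ".text", "_start:"]
    # Indent instructions for readability; the assembler ignores the spaces.
    lines += [f"    {ins}" for ins in instructions]
    if string_literal is not None:
        lines += [
            ".data",
            "msg:",
            f"    .ascii {string_literal}",
            "len = . - msg",  # bytes between msg and the current address
        ]
    return "\n".join(lines) + "\n"
```

Calling assemble(["syscall"], '"hi"') produces a program whose .data section contains .ascii "hi" followed by len = . - msg.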
Testing the emitter
For now, the easiest way to test your emitter is probably to copy and paste the output of the compiler into the x86-64 playground.
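When checking expression programs, it can help to know which exit status to expect. The sketch below evaluates an AST directly in Python, using the same assumed AST shape as elsewhere (an int or a tuple (op, left, right)) and the assumed operators +, - and *. On Linux only the low 8 bits of the exit argument are reported as the exit status, so the helper masks the result; how the playground displays the status may differ. The names evaluate and expected_exit_status are hypothetical.

```python
import operator

# Assumed operator set, matching the emitter sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(ast) -> int:
    """Evaluate an AST that is an int or a tuple (op, left, right)."""
    if isinstance(ast, int):
        return ast
    op, left, right = ast
    return OPS[op](evaluate(left), evaluate(right))


def expected_exit_status(ast) -> int:
    """Low 8 bits of the result, as reported for a Linux exit status."""
    return evaluate(ast) & 0xFF
```

For instance, an expression that evaluates to 300 would be reported as exit status 44, and one that evaluates to -1 as 255.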